Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add new benchmarks and mapper cache #412

Merged
merged 16 commits into from
Nov 16, 2023
Merged

Add new benchmarks and mapper cache #412

merged 16 commits into from
Nov 16, 2023

Conversation

akudiyar
Copy link
Collaborator

@akudiyar akudiyar commented Aug 10, 2023

This work is a part of the global task of decoupling the MessagePack object mapping from the core driver code in the scope of discoverable mapper bindings, caching, and support for different Tarantool cluster and API library implementations.

The goal of this patch is to reduce the number of Java reflection calls by introducing the concept of "request signatures" and replacing real-time converter lookup with memoization and table lookup.

New cluster-oriented benchmark scripts are also added. The comparison and memory consumption improvements based on the benchmark measurements will follow in the next PRs.

See the commit messages for more detailed information about the changes.

TODO:

  • Decouple mapping from the core driver workflow and move it up to the client implementation level
  • Implement request signatures
  • Change mapper references to suppliers in API
  • Implement mapper binding to request signatures
  • Perform benchmark results comparison

Depends on #410

@akudiyar akudiyar force-pushed the mappers-cache branch 4 times, most recently from f58893c to 7747033 Compare August 12, 2023 09:49
@akudiyar akudiyar force-pushed the mappers-cache branch 3 times, most recently from 9532ffe to 968960a Compare August 31, 2023 00:11
@akudiyar akudiyar force-pushed the mappers-cache branch 3 times, most recently from a0968b8 to bccef82 Compare October 8, 2023 20:46
@akudiyar akudiyar changed the title [WIP] Mapper cache PoC Mapper cache PoC Oct 16, 2023
@akudiyar akudiyar requested review from ArtDu, nickkkccc and iDneprov and removed request for ArtDu and nickkkccc October 16, 2023 20:19
@akudiyar akudiyar changed the title Mapper cache PoC Add new benchmarks and mapper cache Oct 16, 2023
@dkasimovskiy dkasimovskiy removed the request for review from iDneprov October 19, 2023 11:04
This patch is a part of the global task of decoupling the MessagePack object mapping from the core driver code in the scope of discoverable mapper bindings, caching and support for different Tarantool cluster implementations
Replace the result future with request metadata that will allow to
encapsulate more information about the request
…erences

This change is breaking and it is required for making on-demand mapper
instantiation only in cases if the corresponding mapper is not found in
the mapper cache (will be done later).

Also one incorrect call() variant is removed.
This change is breaking and it is required for making on-demand mapper
instantiation only in cases if the corresponding mapper is not found in
the mapper cache (will be done later).
This change is breaking and it is required for making on-demand mapper
instantiation only in cases if the corresponding mapper is not found in
the mapper cache (will be done later).
The actual values of the request arguments should not affect the
signature value if they are not String values - this is a special
exception to be able to include server function names and return types
as the signature components. For the other components only the order and
argument types matter for this purpose.
This change actually enables mapping request parameters to the
corresponding argument and result mappers. There is still some overhead
produced from the anonymous context-bound Function instances, but it
should be constant and significantly less than the overhead of building
and storing the mapper stacks on each request.
One more useful application of the request signatures is that it is now
possible to get more detailed information about a timed out request. The
internal request IDs that have been used before are completely useless
for determining what exact call has failed if a timeout has happened.
Fix NPE happening when the server responds too late with an error, and
the request is already failed with timeout and request metadata is
removed
Change the default signature creation algorithm to fix possible cache
pollution
For TarantoolSpace and TarantoolSpaceMetadata classes we have static
mappers that should be really instantiated once, as well as the
suppliers for them. At the call sites we use references to the methods
returning mappers, so they will be optimized further by JVM.
There was a problem with the previous approach if we were going to call
a polymorphic remote function that will return results with different
schema depending on the argument. That is the case for calling the
tarantool/crud functions for example - the schema depends on the space
metadata. But if we supply a result mapper for each call, for each
result schema it will be different. So the hashcode of the result mapper
supplier is now included in the request signature, as well as the
hashcode of the arguments mapper supplier.

Here we assume that if we have identical mappers for two calls, their
suppliers must be also identical. Therefore we can have two possible
problems:
 - cash pollution if a mapper supplier for the same mapper is created
 for each call,
 - mapping errors if the supplier is polymorphic and the same supplier
 instance is used for calls with different result schema.
Both problems actually relate to a somehow bad written code, so they
does not look like showstoppers. The second problem also looks like a
rare one, but to avoid sudden user application crash in case of the
first one we need perhaps to define the maximum cache size and establish
eviction for the cash entries.
The new benchmark tries to measure the following things:
 - cost of allocating a new space object for the proxy client
 - cost of writing data to two different spaces (with different mappers)
 - cost of reading data from two different spaces (with diferent mappers)

To get some useful results for memory profiling run this benchmark as
follows:

mvn exec:exec -Pbenchmark -Dbenchmark="ClusterBenchmarkRunner" -DbenchmarkArgs="-prof=jfr"

To analyze these results use the JMC plug-in for Eclipse. You may use
the other profiler types and pass additional arguments to the JMC tool.
@akudiyar akudiyar marked this pull request as ready for review October 29, 2023 20:28
@akudiyar akudiyar requested a review from ArtDu November 12, 2023 12:21
@dkasimovskiy dkasimovskiy self-requested a review November 16, 2023 08:15
@akudiyar akudiyar added this pull request to the merge queue Nov 16, 2023
Merged via the queue into master with commit dd694c9 Nov 16, 2023
3 checks passed
@akudiyar akudiyar deleted the mappers-cache branch November 16, 2023 22:14
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants